Burrows-Wheeler transform and LCP array construction in constant space

Authors

  • Felipe Alves da Louza
  • Travis Gagie
  • Guilherme P. Telles
Abstract

In this article we extend the elegant in-place Burrows-Wheeler transform (BWT) algorithm proposed by Crochemore et al. (2015). Our extension is twofold: we first show how to compute the longest common prefix (LCP) array simultaneously with the BWT, using constant additional space; we then show how to build the LCP array directly in a compressed representation using Elias coding, still using constant additional space and with no asymptotic slowdown. Furthermore, we provide a time/space tradeoff for our algorithm when additional memory is allowed. Our algorithm runs in quadratic time, as does that of Crochemore et al., and is supported by interesting properties of the BWT and of the LCP array, contributing to our understanding of the time/space tradeoff curve for building indexing structures.
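To make the objects named in the abstract concrete, here is a minimal Python sketch of what the BWT and the LCP array are, plus one Elias code. It is not the constant-space algorithm of the article: it sorts suffixes naively and uses linear extra memory. The function names, the `$` sentinel, the choice of Elias gamma (the abstract says only "Elias coding"), and the shift of each LCP value by one so that zeros can be encoded are all assumptions made for illustration.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix itself.
    # This uses O(n) extra space, unlike the in-place method in the article.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_and_lcp(text, sentinel="$"):
    # Assumption: the sentinel is absent from the text and sorts before
    # every other symbol (true for "$" against ASCII letters).
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    t = text + sentinel
    sa = suffix_array(t)

    # BWT: the symbol preceding each suffix in sorted order
    # (index -1 wraps around to the sentinel for the suffix at position 0).
    bwt = "".join(t[i - 1] for i in sa)

    # LCP[k]: length of the longest common prefix of the suffixes ranked
    # k-1 and k; LCP[0] is set to 0 by convention here.
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b = sa[k - 1], sa[k]
        h = 0
        while a + h < len(t) and b + h < len(t) and t[a + h] == t[b + h]:
            h += 1
        lcp[k] = h
    return bwt, lcp


def elias_gamma(n):
    # Elias gamma code of an integer n >= 1: (len(binary) - 1) zeros
    # followed by the binary representation of n.
    if n < 1:
        raise ValueError("Elias gamma is defined for n >= 1")
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b


def encode_lcp(lcp):
    # Assumption: shift every value by 1 so that 0 becomes encodable.
    return "".join(elias_gamma(v + 1) for v in lcp)


if __name__ == "__main__":
    bwt, lcp = bwt_and_lcp("banana")
    print(bwt, lcp, encode_lcp(lcp))
```

Small LCP values, which are common in practice, get short codewords under this scheme, which is the intuition behind storing the array in compressed form.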


Similar resources

Two Space Saving Tricks for Linear Time LCP Array Computation

In this paper we consider the linear time algorithm of Kasai et al. [6] for the computation of the Longest Common Prefix (LCP) array given the text and the suffix array. We show that this algorithm can be implemented without any auxiliary array in addition to the ones required for the input (the text and the suffix array) and the output (the LCP array). Thus, for a text of length n, we reduce t...


Low Space External Memory Construction of the Succinct Permuted Longest Common Prefix Array

The longest common prefix (LCP) array is a versatile auxiliary data structure in indexed string matching. It can be used to speed up searching using the suffix array (SA) and provides an implicit representation of the topology of an underlying suffix tree. The LCP array of a string of length n can be represented as an array of length n words, or, in the presence of the SA, as a bit vector of 2n...


Longest-Common-Prefix Computation in Burrows-Wheeler Transformed Text

In this paper we consider the existing algorithm for computing the Longest-Common-Prefix (LCP) array given a text string and its suffix array, and adapt it to work on Burrows-Wheeler Transform (BWT) text. We did this through a combination of pre-processing steps and improvements based on the existing algorithm. Three LCP array computation algorithms were proposed, namely LCPB-A, LCPB-B and LCPB-C tha...


Lightweight LCP construction for very large collections of strings

The longest common prefix array is a very advantageous data structure that, combined with the suffix array and the Burrows-Wheeler transform, makes it possible to efficiently compute some combinatorial properties of a string that are useful in several applications, especially in biological contexts. Nowadays, the input data for many problems are big collections of strings, for instance the data coming from “next-g...


Two space saving tricks for linear time LCP computation

In this paper we consider the linear time algorithm of Kasai et al. [10] for the computation of the LCP array given the text and the suffix array. We show that this algorithm can be implemented without any auxiliary array in addition to the ones required for the input (the text and the suffix array) and the output (the LCP array). Thus, for a text of length n, we reduce the space occupancy of t...



Journal:
  • J. Discrete Algorithms

Volume 42

Publication date: 2017